-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add .as_str() to str::Chars and str::CharIndices #27900
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
#27775 proposes removing I think that not only this functionality should be stabilized as-is, but should also be extended to |
@@ -292,6 +292,13 @@ impl<'a> DoubleEndedIterator for Chars<'a> { | |||
} | |||
} | |||
|
|||
impl<'a> Chars<'a> { | |||
#[unstable(feature = "iter_to_slice", issue = "27775")] | |||
pub fn as_str(&self) -> &str { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be -> &'a str
, so that you can continue modifying the iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Fixed.
2a8b926
to
bc57162
Compare
I would personally rather hold off on these kinds of conversions until we have our story a bit more thought out. Right now it's pretty piecemeal what iterators can be converted back to the original item, and it would be nice to have consistency across the board instead of adding one-off unstable iterators here and there. I feel that in the case of #27775 this kind of functionality is nice, but so rarely used that it's not worth it right now. Almost no iterators have actual methods on the iterators (they're all just straight implementors of |
The slice iterator is kind of more than fundamental, it's one of the most important types. Sure, we don't have a great story figured out in particular for random access. I'd love to see these methods on both slice and str iterators (str will not have random access of course, but maybe iteration mixed with other searching methods). |
We discussed this at the libs meeting today and the conclusion was to merge this and push forward with these sorts of conversions. @SimonSapin could you make sure that the other string iterators in |
bc57162
to
1234e91
Compare
Ok, I think that’s all of them. However to make it happen I’ve had to:
What do you all think? |
StrSearcherImpl::Empty(EmptyNeedle { position, end, .. }) | | ||
StrSearcherImpl::TwoWay(TwoWaySearcher { position, end, .. }) => { | ||
unsafe { | ||
self.haystack.slice_unchecked(position, end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't work for the StrSearcher: StrSearcher is not a double ended searcher*, just a Searcher and RevereseSearcher: the searching ends are independent. This means that position
and end
may be at arbitrary positions inside haystack i.e not position <= end
.
The remaining haystack is both haystack[position..]
and haystack[..end]
.
Oh and the position
and end
may end up on non-character boundaries depending on how the algorithm is used (next()
vs .next_match()
etc), even if Match and Reject are only emitted with character boundary offsets.
(*) It's not double ended because it would be a very expensive algorithm. Right now, "aa"
in "aaa"
returns a single match (0, 2) from the front and a single match (1, 3) from the back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can work if it has separate methods for the front and back case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I would like to not do this, it's another restriction on the string search algorithm (that it can seek to an appropriate place to cut the remaining part, and that it hasn't jumped far ahead for some reason).
One option is to add the method to |
/// iterator can continue to be used while this exists. | ||
#[unstable(feature = "iter_to_slice", issue = "27775")] | ||
#[inline] | ||
pub fn as_str(&self) -> &'a str { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this safely be done? We don't know that iteration isn't in the middle of a multi byte character, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, that’s a good point. I’ll make this one return &'a [u8]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… or remove it entirely because it accesses the inner iterator in Map
.
I think for now it's best to start off with the "easy" iterators like chars/charindices, we can hold off on the inherently more complicated pattern-like iterators for now I think. |
1234e91
to
e33650c
Compare
Chars::as_str
.
Ok, we’re back to just |
See #27775.
r? @alexcrichton